Calculating Haversine Distance with Python

November 9, 2025
Marcus Wyche

For a project I was working on I needed to determine the distance between multiple latitude longitude coordinates. After doing some googling it was odvious that Haversine Distance is what I should use for this.

What is Haversine Distance

According to Wikipedia "Haversine Formula determines the great-circle distance between two points on a sphere given their longitudes and latitudes". The formula for calculating Haversine distance is:

  • a = sin²(Δφ/2) + cos φ1 ⋅ cos φ2 ⋅ sin²(Δλ/2)
  • c = 2 ⋅ atan2( √a, √(1−a) )
  • d = R ⋅ c

In formula above φ is the latitude and λ is longitude w/ R representing the Radius of the earth(a sphere, hush flat earthers!). Then d is representing the haversine distance.

To gain a better understanding of the formula I wanted to implement it in Python. I'm no mathematician but I found this article by Simon Kettle easy to follow for understanding how to implement this in Python.

"""
Utility class to use python to calculate the haversine distance between two longitude
and latitude points.
"""
from math import radians, sin, cos, atan2, sqrt

# Radius of earth in meters
RADIUS_OF_EARTH = 6371000

def calculate_haversine_distince(lat_long_start:list, lat_long_end:list):
    """Calculate the haversine distance."""
    # Taken from here
    # https://community.esri.com/t5/coordinate-reference-systems-blog/distance-on-a-sphere-the-haversine-formula/ba-p/902128

    # Direct Unpack the
    lat1, lon1 = lat_long_start
    lat2, lon2 = lat_long_end

    # Radians
    phi_1 = radians(lat1)
    phi_2 = radians(lat2)

    delta_phi = radians(lat2 - lat1)
    delta_lambda = radians(lon2 - lon1)

    a = sin(delta_phi / 2.0) ** 2 + cos(phi_1) * cos(phi_2) * sin(delta_lambda / 2.0) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    # Output distance in meters
    meters = RADIUS_OF_EARTH * c

    return meters

You can see the source code for this here.

Implementing the formula led to greater understanding and some new insights for me:

  • Radians above we're converting degrees(latitude and longitude) to radians because this is the standard unit of angular measure used in the Haversine formula.
  • 6,371,000 is the mean earth radius in meters and is used to get to meters in the formula above.

Testing

Implementing the raw formula in Python is great but I also need to ensure that the output from the formula above is correct. In order to do that I'm going to use Scikit-learn implementation of Haversine Distance which can be seen here .

"""
Test for haversine distance.
"""
import logging as log
from math import radians
import pytest
from sklearn.metrics.pairwise import haversine_distances
from python_examples.math.haversine_distance import calculate_haversine_distince, RADIUS_OF_EARTH

@pytest.mark.parametrize("lat_long_origin,lat_long_destination", [
    # Scikit Learn example Ezeiza Airport -> Charles de Gaulle Airport
    pytest.param(
        [-34.83333, -58.5166646],
        [49.0083899664, 2.53844117956]
    )
])
def test_calculate_haversince_distance(lat_long_origin:list, lat_long_destination:list):
    """Test haversine_distance against """
    log.info(f"{lat_long_origin} - {lat_long_destination}")
    origin_in_radians = [radians(_) for _ in lat_long_origin]
    destination_in_radians = [radians(_) for _ in lat_long_destination]
    distance_matrix_result = haversine_distances([origin_in_radians, destination_in_radians])
    km_distance_matrix_result = distance_matrix_result * RADIUS_OF_EARTH / 1000

    haversine_distance = calculate_haversine_distince(lat_long_origin, lat_long_destination)
    km_haversine_distance = round(haversine_distance / 1000, 8)
    sklearn_distance = km_distance_matrix_result[0][1]
    log.info(f"Raw Calc={km_haversine_distance}, Sklearn Calc={sklearn_distance}")
    # Rounding both to 8 digits of accuracy...
    assert km_haversine_distance == round(km_distance_matrix_result[0][1], 8)

You can see the source for this here.

Insights from doing the testing:

  • Outside of greater understanding and not wanting to bring scikit-learn into your dependencies due to it's size there's no real need to implement Haversine Distance in python on your own. Scikit learn returns a distance matrix and can natively handle N latitude and longitude points finding the distance between all of them.
  • Distance Matrix is a useful data structure for seeing the distance between multiple points.
  • Getting the shortest distance from the 2D array returned by scikit-learn can be achieved by doing the following:

def test_shortest_distance():
    # Lincoln Memorial Latlong
    lincoln_memorial = [38.889248, -77.050636]

    # Ducinni's off U St
    ducinnis_pizza = [38.91706701509112, -77.04118715785616]

    # Admo Jumbo Slice
    jumbo_slice_pizza = [38.92105748065034, -77.04170211776945]

    # Manny Olgas Near U
    manny_olgas = [38.91543818061708, -77.03174726038766]

    # Based off Haversine which is closest to Lincoln Memorial
    dataset = [
        {'name': 'Lincoln Memorial', 'geocode': lincoln_memorial},
        {'name': "Ducinni's Pizza", 'geocode': ducinnis_pizza},
        {'name': 'Jumbo Slice Pizza', 'geocode': jumbo_slice_pizza},
        {'name': "Manny Olga's", 'geocode': manny_olgas}
    ]

    # Put all these in radians
    radian_distances = list(map(lambda x: [radians(_) for _ in x['geocode']], dataset))

    # Calculate the Haversine Distances
    distance_matrix_result = haversine_distances(radian_distances)

    # Pull out the first Entry in the 2D array
    km_distance_matrix_result = distance_matrix_result * RADIUS_OF_EARTH / 1000
    log.info(f"\nDistance Matrix in Kilometers:\n {km_distance_matrix_result}")

    # Get the index of the minimum distance for the lincoln memorial
    np_array = np.array(km_distance_matrix_result[0][1:])
    idx_min = np.argmin(np_array)
    # Closest Should be Ducinnis
    assert idx_min + 1 == 1
    log.info(f"\nClosest Jumbo Slice spot to Lincoln Memorial by Haversine Distance is {dataset[idx_min + 1]['name']}")

Code can be seen here.

With example output below:

Test Output

From all this I've learned Haversine Distance is a useful formula for calculating distances between geo-coordinates. In general I'd just use scikit-learn in order to get these distances but it's always great to have some base understanding of the formula you're using.